We present a new asymptotic bipartite entanglement distillation protocol that outperforms all
existing asymptotic schemes. This protocol is based on the breeding protocol with the incorporation
of two-way classical communication. Like breeding, the protocol starts with an infinite number
of copies of a Bell-diagonal mixed state. Breeding can be carried out as successive stages of partial
information extraction, yielding the same result: one bit of information is gained at the cost
(measurement) of one pure Bell state pair (ebit). The basic principle of our protocol is at every
stage to replace measurements on ebits by measurements on a finite number of copies, whenever
there are two equiprobable outcomes. In that case, the entropy of the global state is reduced by
more than one bit. Therefore, every such replacement results in an improvement of the protocol.
We explain how our protocol is organized so as to have as many replacements as possible. The yield
is then calculated for Werner states.

PACS numbers: 03.67.Mn 
I. INTRODUCTION



Quantum entanglement is an important resource in
many applications of quantum cryptography and quantum
communication. Some well-known examples are
teleportation, quantum key distribution and superdense
coding. These applications require pure and
maximally entangled qubit pairs, called Bell state pairs, 
that are shared by two remote parties. One party pre- 
pares the Bell states and sends one qubit to the other 
party via some quantum channel. In a realistic setting, 
this channel is not perfect: uncontrollable influences of 
the environment (decoherence) will affect the qubit sent, 
resulting in qubit pairs that are in a mixed state and 
unsuitable for the application in mind. 

Entanglement distillation is the process of applying lo- 
cal operations (local with respect to the parties) to the 
mixed state qubit pairs combined with classical commu- 
nication (LOCC) in order to obtain pure Bell state pairs. 
Typically, we assume stationarity of the quantum channel,
affecting all qubit pairs in the same way. As a result,
we have $n$ copies of the same mixed two-qubit state $\rho$.
Protocols like hashing or breeding have a net output
of $m$ qubit pairs whose states approach pure Bell
states if $n$ goes to infinity. We call such protocols asymptotic
and the fraction of distilled Bell states per initial
copy the yield m/n. Breeding differs from hashing by the 
use of an initial pool of predistilled Bell state pairs, but 
these protocols are known to be equivalent. The classical 
communication between the parties in both hashing and 
breeding is only in one direction. With two-way communication,
higher yields can be achieved. Indeed, the
two parties can choose between alternative courses of the 
protocol based on information on intermediate stages. 
We call such a protocol adaptive. 

Entanglement distillation protocols, apart from being
necessary for applications, are also interesting for theoretical
purposes. The important entanglement measure
called entanglement of distillation of $\rho$ is defined as the maximal
asymptotic yield. It is lower bounded by the yields
of all distillation protocols and is in itself a lower bound
for all sensible measures of entanglement. Therefore,
significantly improving distillation protocols brings
us closer to a better understanding of the irreversible nature
of entanglement manipulation.

Our protocol is based on the breeding protocol, with
the incorporation of two-way communication. Until recently,
the breeding and hashing protocols were the only existing
asymptotic protocols, apart from a slightly better
performing variant of hashing. Adaptive upgrades
of breeding/hashing mostly consist of breeding/hashing
preceded by non-asymptotic recurrence-like schemes,
resulting in higher yields only for low-fidelity states.
Some adaptive protocols in the literature violate
all kinds of one-way communication quantum error correction
bounds, yet asymptotically do not perform any
better than breeding/hashing. But Vollbrecht and Verstraete [13]
came up with protocols that introduce two-way
communication on an asymptotic level, improving
breeding/hashing for all states. However, their protocols
are rather ad hoc: further improvements are suggested
by exhaustive searches over a rather untransparent decision
space. We will explain the principles that underlie
these improvements and create new protocols
that, by exploiting these ideas, significantly outperform
all existing schemes.

Like all protocols mentioned, our protocols work for
copies of a state $\rho$ that is diagonal in the Bell basis, also
called Bell-diagonal. If $\rho$ is not Bell-diagonal, separate
optimal single-copy distillation protocols can be applied
to each copy to make them Bell-diagonal. A nice feature
of Bell-diagonal states is that they can be entirely
interpreted in classical information theory. Indeed, the
state $\rho^{\otimes n}$ is equivalent to a statistical ensemble of tensor
products of Bell states. In the breeding protocol, information
on $\rho^{\otimes n}$ is gathered from measurements on the Bell
state pairs (ebits) of the initial pool, after letting them
locally interact with $\rho^{\otimes n}$. One bit of information is gained
for every ebit measurement, or equivalently, the entropy
of $\rho^{\otimes n}$ is reduced by one bit. When the entropy of $\rho^{\otimes n}$ is
reduced to zero, the ensemble has become a pure tensor
product of Bell states. As will be explained in Sec. III,
breeding can be divided into successive stages of partial
information extraction, yielding an equivalent protocol.
The basic principle of our protocol is at every stage to
replace measurements on ebits by measurements on a finite
number of copies, whenever there are two equiprobable
outcomes. It can be verified that the entropy of the global
state is then reduced by more than one bit. This is because
whenever an observable is measured, the state is
projected onto the eigenspace of the observable, thereby
eliminating the entropy associated with the outcomes of
observables not commuting with the one measured. We
will explain how our protocol is organized so as to have as
many replacements as possible.

This paper is organized as follows. In the preliminary
Sec. II, an overview is given of the binary language
in which our protocols are efficiently described. We also
explain the two relevant ways of extracting information
on an unknown tensor product of Bell states. In Sec. III,
we briefly explain the breeding protocol, partial breeding
and the improvement of Ref. [13]. In Sec. IV, we elaborate
on the principle of entropy reduction, on which our protocol
is mainly based. The way equiprobable outcomes
are forced and other ideas simplifying our protocols are
then described in Sec. V. We also give a method for
numerically calculating the yield. This is finally illustrated
for Werner states in Sec. VI. We conclude in Sec. VII.



II. PRELIMINARIES 

In this section we give a short overview of the binary
language in which distillation protocols are often
expressed. For a detailed discussion and proofs of these
results we refer to the literature.



A. Binary representation of Bell states, Pauli 
operators and Clifford operators 

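Bell states can be represented by assigning two-bit vectors
to the Bell states as follows:

\[
\begin{aligned}
|\Phi^+\rangle &= \tfrac{1}{\sqrt{2}}\left(|00\rangle + |11\rangle\right) = |B_{00}\rangle \\
|\Psi^+\rangle &= \tfrac{1}{\sqrt{2}}\left(|01\rangle + |10\rangle\right) = |B_{01}\rangle \\
|\Phi^-\rangle &= \tfrac{1}{\sqrt{2}}\left(|00\rangle - |11\rangle\right) = |B_{10}\rangle \\
|\Psi^-\rangle &= \tfrac{1}{\sqrt{2}}\left(|01\rangle - |10\rangle\right) = |B_{11}\rangle.
\end{aligned}
\]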

We consider all Bell states shared by two parties A and
B. In the following, all "local" operations are local with
respect to the partition into A and B. In an analogous
way, the Pauli matrices are identified with two-bit vectors:

\[ \sigma_{00} = I_2, \quad \sigma_{01} = \sigma_x, \quad \sigma_{10} = \sigma_z, \quad \sigma_{11} = \sigma_y. \]

For notational convenience, we will often denote a binary
vector by a string (e.g. 1010 means $[1\ 0\ 1\ 0]^T$). A tensor
product of $n$ Bell states can then be described by a $2n$-bit
vector, e.g. $|B_{010011}\rangle = |B_{01}\rangle \otimes |B_{00}\rangle \otimes |B_{11}\rangle$. The
same rule applies for a Kronecker product of Pauli matrices.
The Pauli group is defined to contain all Kronecker
products of Pauli matrices with an additional complex
phase factor in $\{1, i, -1, -i\}$, called Pauli operators. In
the following, we will only consider Hermitian Pauli operators
and neglect overall phase factors.

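For all $a, b, s, t \in \mathbb{Z}_2^{2n}$, the following relations hold:

\[
\begin{aligned}
\sigma_a \sigma_b &\sim \sigma_{a+b}, \\
(I_{2^n} \otimes \sigma_t)\,|B_s\rangle &\sim |B_{s+t}\rangle,
\end{aligned}
\]

where "$\sim$" denotes equality up to an overall phase factor.
All addition of binary objects is done modulo 2.
Two Pauli operators $\sigma_a$ and $\sigma_b$ commute if the symplectic
inner product $a^T P b$ is equal to zero, or

\[ \sigma_a \sigma_b = (-1)^{a^T P b}\, \sigma_b \sigma_a, \quad \text{where } P = I_n \otimes \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}. \]

A Clifford operator $Q$ maps the Pauli group to itself
under conjugation, and can be represented by a symplectic
matrix $C \in \mathbb{Z}_2^{2n \times 2n}$:

\[ Q \sigma_a Q^\dagger \sim \sigma_{Ca}. \]

Symplecticity of $C$ is expressed by $C^T P C = P$. In the
context of distillation protocols, we have the following
interesting result: let $Q$ be represented by $C$ and $Q^*$
be the complex conjugate of $Q$, then it holds that

\[ (Q \otimes Q^*)\,|B_s\rangle \sim |B_{Cs}\rangle, \quad \text{for all } s \in \mathbb{Z}_2^{2n}. \tag{1} \]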
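
The commutation rule above can be checked against explicit matrices. The following is a minimal sketch, assuming the labeling $\sigma_{00} = I_2$, $\sigma_{01} = \sigma_x$, $\sigma_{10} = \sigma_z$, $\sigma_{11} = \sigma_y$ and bits grouped per qubit pair; the helper names are illustrative and not taken from the paper.

```python
from itertools import product

# Assumed labeling of the single-qubit Pauli matrices by two-bit vectors.
PAULI = {
    (0, 0): [[1, 0], [0, 1]],      # identity
    (0, 1): [[0, 1], [1, 0]],      # sigma_x
    (1, 0): [[1, 0], [0, -1]],     # sigma_z
    (1, 1): [[0, -1j], [1j, 0]],   # sigma_y
}


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def symplectic_product(a, b):
    """Return a^T P b mod 2, with P = I_n (x) [[0, 1], [1, 0]]."""
    total = 0
    for i in range(0, len(a), 2):
        total += a[i] * b[i + 1] + a[i + 1] * b[i]
    return total % 2


def matrices_commute(a, b):
    """Check commutation of sigma_a and sigma_b by explicit multiplication."""
    return matmul(PAULI[a], PAULI[b]) == matmul(PAULI[b], PAULI[a])


# Compare the binary rule with the matrices for all 16 single-qubit cases.
agreement = all(
    matrices_commute(a, b) == (symplectic_product(a, b) == 0)
    for a, b in product(PAULI, repeat=2)
)
print(agreement)
```

For longer vectors the same function applies pair by pair, e.g. `symplectic_product((1, 1, 1, 1), (1, 1, 0, 0))` evaluates the product for the checks 1111 and 1100.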

B. Information extraction 

Information on an unknown tensor product of $n$ Bell
states $|B_s\rangle$, $s \in \mathbb{Z}_2^{2n}$, in the context of distillation
protocols, is extracted in the form of an inner product
$r^T s$, where $r$ is an arbitrary nonzero $2n$-bit vector. We
will call this action a parity check. This can be done in
two ways:

1. by local Clifford operations on the tensor product
and an appended ebit, $|B_s\rangle \otimes |B_{00}\rangle$, followed by the
local measurement of the ebit;

2. by directly performing local measurements on $|B_s\rangle$.

We explain the two ways in more detail, and call them
appended ebit measurement (AEM) and bilateral Pauli
measurement (BPM) respectively.

By means of local Clifford operations $Q$, we first
transform $|B_s\rangle \otimes |B_{00}\rangle$ into $|B_s\rangle \otimes |B_{0,\,r^T s}\rangle$; this action
corresponds to a symplectic matrix $C$.
Then, a $\sigma_z$ measurement is performed on both sides of
the appended ebit. The product of the outcomes is equal
to $(-1)^{r^T s}$. Indeed, the outcomes of a Pauli measurement
performed locally on both qubits of a Bell state are perfectly
correlated or anticorrelated; for $\sigma_z$ they are equal for
$|B_{00}\rangle$ and $|B_{10}\rangle$ and opposite for $|B_{01}\rangle$ and $|B_{11}\rangle$.

It follows that the product of the outcomes of a bilateral
(i.e. on both sides) measurement $\sigma_{Pr}$ on a tensor product
of Bell states $|B_s\rangle$ equals $(-1)^{r^T s}$.
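
The outcome rule for a bilateral Pauli measurement can be verified numerically on a single qubit pair. The sketch below is an illustration under stated assumptions: the labeling $\sigma_{01} = \sigma_x$, $\sigma_{10} = \sigma_z$, $\sigma_{11} = \sigma_y$, and party B measuring the complex conjugate operator (mirroring the form $Q \otimes Q^*$ of Eq. (1)); the function names are hypothetical.

```python
import math

I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SZ = [[1, 0], [0, -1]]
SY = [[0, -1j], [1j, 0]]
PAULI = {(0, 0): I2, (0, 1): SX, (1, 0): SZ, (1, 1): SY}

H = 1 / math.sqrt(2)
# Amplitudes in the basis |00>, |01>, |10>, |11> (first qubit at A, second at B).
BELL = {
    (0, 0): [H, 0, 0, H],
    (0, 1): [0, H, H, 0],
    (1, 0): [H, 0, 0, -H],
    (1, 1): [0, H, -H, 0],
}


def kron(a, b):
    """Kronecker product of two 2x2 matrices as a 4x4 nested list."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]


def conjugate(m):
    """Entrywise complex conjugate of a matrix."""
    return [[complex(x).conjugate() for x in row] for row in m]


def bilateral_outcome(a, s):
    """Expectation of sigma_a (x) sigma_a^* in |B_s>, rounded to +1 or -1."""
    op = kron(PAULI[a], conjugate(PAULI[a]))
    v = BELL[s]  # real amplitudes, so no conjugation is needed below
    w = [sum(op[i][j] * v[j] for j in range(4)) for i in range(4)]
    return round(sum(x * y for x, y in zip(v, w)).real)


def parity_check(r, s):
    """Product of outcomes of the bilateral measurement sigma_{Pr} on |B_s>."""
    pr = (r[1], r[0])  # P swaps the two bits of a single pair
    return bilateral_outcome(pr, s)
```

Since every Bell state is an eigenstate of these bilateral operators, the expectation value equals the product of the two local outcomes.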

An AEM does not affect the state $|B_s\rangle$. Therefore, this
procedure can be repeated consecutively for different $r$,
as in the breeding protocol. However, the same does
not hold for a BPM. Because our protocol will consist of
both methods in various combinations, we need to sort
out how this can be done. In earlier work we showed that, in
theory, a BPM is equivalent to the following procedure:

1. perform local Clifford operations that correspond
to a symplectic $C$ of which the last row is
$r^T$: such a $C$ can always be found, for every $r \neq 0$;

2. then, perform a bilateral $\sigma_z$ measurement on the
last qubit pair;

3. finally, apply the inverse of the local Clifford operations
of the first step.

Note that the result is no longer a tensor product of Bell
states, as the last of the qubit pairs is measured in the
second step, leaving it in a separable state. Since an AEM
leaves the state $|B_s\rangle$ unaffected, we only need to worry
about the situation after a BPM. The only irreversible
step applied is the measurement of the last qubit pair,
which yields knowledge of $r^T s$ but destroys any other
information contained by this pair. After this step, we
are left with the state $|B_{\tilde{C}s}\rangle$, where $\tilde{C} \in \mathbb{Z}_2^{2(n-1) \times 2n}$ is
equal to $C$ without the last two rows. The only information
on $|B_s\rangle$ left for us to extract is the information
we can extract from $|B_{\tilde{C}s}\rangle$. Clearly, we can perform parity
checks yielding $a^T \tilde{C} s$, for all $a \in \mathbb{Z}_2^{2(n-1)}$. This is
equivalent to determining $q^T s$, for all $q \in \mathbb{Z}_2^{2n}$ that satisfy
$q^T P r = 0$. Indeed, as $C$ is symplectic, all such $q$
are in the column space of $[\tilde{C}^T\ r]$, or $q = \tilde{C}^T a + \alpha r$, for
some $a \in \mathbb{Z}_2^{2(n-1)}$ and $\alpha \in \mathbb{Z}_2$. Since $r^T s$ was already
determined, we know $q^T s = a^T \tilde{C} s + \alpha\, r^T s$ by determining
$a^T \tilde{C} s$ from the new state. In general, every time
we determine $r^T s$ of $|B_s\rangle$ by a BPM, afterwards we can
only access $q^T s$ where $q^T P r = 0$, whatever method we
use. This should not come as a surprise, because when
$q^T P r = 1$, the Pauli measurements $\sigma_{Pr}$ and $\sigma_{Pq}$ anticommute,
so their outcomes cannot both be determined.

In reality, after a BPM, we should continue working
with the transformed state, which is represented by the image
of $s$ under the reduced matrix $\tilde{C}$. But this
requires knowledge of the whole matrix $C$, while the parity
check is specified only by $r$. As explained in the previous
paragraph, we can describe all future actions in
terms of $s$: we only need to know which BPMs have been
done. This yields a much more transparent description
of the protocol.

III. BREEDING IMPROVED 

In this section, we start by briefly explaining the breeding
protocol. Basically,
information on $n$ copies of a Bell-diagonal mixed state is
extracted by sacrificing ebits until the state is a pure tensor
product of $n$ Bell states (i.e. zero entropy). We then show
how the breeding protocol can be divided into successive
stages of partial information extraction, yielding
an equivalent protocol. Depending on the outcome of
one such stage, a different strategy can be applied, yielding
a protocol that uses two-way communication. We
call such a protocol adaptive, as it adapts to intermediate
outcomes. We will explain an improvement of the
breeding protocol that has been found in this way by
Vollbrecht and Verstraete [13]. For details, we refer to
the original references.

The breeding protocol starts from $n$ copies of a Bell-diagonal
mixed state

\[ \rho = \sum_{v \in \mathbb{Z}_2^2} p_v\, |B_v\rangle\langle B_v|. \]

The global state $\rho^{\otimes n}$ is equivalent to a statistical ensemble
of pure states $|B_s\rangle$, $s \in \mathbb{Z}_2^{2n}$, with corresponding
probabilities $p_s$ (e.g. $p_{001101} = p_{00}\,p_{11}\,p_{01}$). Consequently,
the state can be regarded as an unknown pure state $|B_s\rangle$.
The goal now is to determine $s$. Once we have pinned
down $|B_s\rangle$, we can transform the state to $|B_0\rangle$ by performing
the unitary transformation $\sigma_s$ on the B side.
With probability approaching 1 for large $n$, this unknown
$s$ is contained in the typical set $T$ that has $\approx 2^{nS(\rho)}$ elements,
where

\[ S(\rho) = -\sum_{v \in \mathbb{Z}_2^2} p_v \log_2 p_v \]

is the entropy of $\rho$. Consecutive parity checks $r^T s$, where all $r$ are random,
each on average rule out half of $T$. Consequently, to
obtain zero entropy (i.e. only one candidate left), about
$nS(\rho)$ AEM are needed, each at the cost of one ebit.
Therefore, the yield of the protocol, which is the net number
of ebits that are distilled for every copy, is equal to
$1 - S(\rho)$.
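
As a numerical illustration of the yield $1 - S(\rho)$, the sketch below evaluates it for a Werner state. It assumes the usual parametrization in which one Bell state has weight $F$ (the fidelity) and the other three share $1 - F$ equally; the function names are illustrative.

```python
import math


def entropy(probs):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def werner(fidelity):
    """Bell-diagonal weights of a Werner state with the given fidelity (assumed form)."""
    rest = (1 - fidelity) / 3
    return [fidelity, rest, rest, rest]


def breeding_yield(probs):
    """Yield 1 - S(rho) of breeding (or hashing) for Bell-diagonal weights."""
    return 1 - entropy(probs)


for fidelity in (0.80, 0.82, 0.90, 1.00):
    print(fidelity, breeding_yield(werner(fidelity)))
```

A negative value means that more ebits are consumed than produced, so breeding alone is useless for such states.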

Partial information on $s$ is extracted by restricting to
parity checks $r^T s$, where $r$ is of the form

\[ r = r' \otimes a, \]

$a$ is some fixed and finite $m$-bit vector ($m$ is even and
divides $2n$) and random $r' \in \mathbb{Z}_2^{2n/m}$ take over the role
of $r$. We will call this technique partial breeding. Note
that it is completely specified by $a$. Therefore we will
denote it by PB $a$. We illustrate how partial breeding
works with an example. Let $a = 1010$, and divide $s$ into
vectors of $m = 4$ bits (i.e. $m/2 = 2$ pairs). Every such
$m$-bit vector $g$ is either an element of $0^{(a)}$, if $a^T g = 0$, or
of $1^{(a)}$, if $a^T g = 1$. For this example, we have

\[
\begin{aligned}
0^{(a)} &= \{0000, 0001, 0100, 0101, 1010, 1011, 1110, 1111\}, \\
1^{(a)} &= \{0010, 0011, 0110, 0111, 1000, 1001, 1100, 1101\}.
\end{aligned}
\]

We have for instance

\[
\begin{array}{rccccccc}
s = & 0010 & 1110 & 0110 & 0011 & 0001 & 1101 & 0100 \\
 & 1^{(a)} & 0^{(a)} & 1^{(a)} & 1^{(a)} & 0^{(a)} & 1^{(a)} & 0^{(a)}
\end{array}
\]

In the same way as for breeding, a typical set can be
associated with the distribution of $0^{(a)}$ and $1^{(a)}$. This
set has $\approx 2^{(2n/m) S^{(a)}(\rho)}$ elements, where

\[ S^{(a)}(\rho) = -p_{0^{(a)}} \log_2 p_{0^{(a)}} - p_{1^{(a)}} \log_2 p_{1^{(a)}}. \]

Therefore, we need $\approx \frac{2n}{m} S^{(a)}(\rho)$ AEM to determine $a^T g$
for all $m$-bit vectors $g$ constituting $s$, with probability
close to 1. For this example, we have

\[
\begin{aligned}
p_{0^{(a)}} &= p_{0000} + p_{0001} + \cdots + p_{1111}, \\
p_{1^{(a)}} &= p_{0010} + p_{0011} + \cdots + p_{1101}.
\end{aligned}
\]
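
The partition induced by a partial breeding vector $a$ is easy to enumerate. The following minimal sketch reproduces the two sets of the example and evaluates $S^{(a)}(\rho)$ for independent pairs; bit strings are plain Python strings and the single-pair distribution `p` is a dictionary, both illustrative choices.

```python
import math
from itertools import product


def parity(a, g):
    """Inner product a^T g modulo 2 for two equal-length bit strings."""
    return sum(int(x) * int(y) for x, y in zip(a, g)) % 2


def partition(a):
    """Split all len(a)-bit strings into the classes 0^(a) and 1^(a)."""
    classes = {0: [], 1: []}
    for bits in product("01", repeat=len(a)):
        g = "".join(bits)
        classes[parity(a, g)].append(g)
    return classes


def string_probability(g, p):
    """Probability of a bit string for independent pairs distributed as p."""
    return math.prod(p[g[i:i + 2]] for i in range(0, len(g), 2))


def partial_entropy(a, p):
    """S^(a)(rho): entropy of the partition into 0^(a) and 1^(a)."""
    p0 = sum(string_probability(g, p) for g in partition(a)[0])
    return -sum(q * math.log2(q) for q in (p0, 1 - p0) if q > 0)


classes = partition("1010")
sample = "0010 1110 0110 0011 0001 1101 0100".split()
labels = [parity("1010", g) for g in sample]
print(classes[0], classes[1], labels)
```

Multiplying `partial_entropy(a, p)` by the number $2n/m$ of $m$-bit vectors in $s$ gives the approximate number of AEM needed for PB $a$.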

We have considered partial information extraction on
a sequence of independent and identically distributed (i.i.d.)
random variables over the set $\{00, 01, 10, 11\}$. But the
same idea can also be applied to the sets $0^{(a)}$ and $1^{(a)}$.
Once we have carried out the previous PB step, we know
for every 4-bit vector (2 pairs) whether it is in $0^{(a)}$ or
$1^{(a)}$. If we bring all vectors in $0^{(a)}$ together, again we
have i.i.d. random variables over $0^{(a)}$, and again we could
perform partial breeding, this time for instance PB $b = 0101$.
Combining this with for instance PB $c = 1000$
for $1^{(a)}$, we get to know for every 4 bits in which of the
following sets they are:

\[
\begin{aligned}
S_1 &= 0^{(a)} \cap 0^{(b)} = \{0000, 0101, 1010, 1111\}, \\
S_2 &= 0^{(a)} \cap 1^{(b)} = \{0001, 0100, 1011, 1110\}, \\
S_3 &= 1^{(a)} \cap 0^{(c)} = \{0010, 0011, 0110, 0111\}, \\
S_4 &= 1^{(a)} \cap 1^{(c)} = \{1000, 1001, 1100, 1101\}.
\end{aligned}
\]

It can be verified that the total number of AEM needed
in the first and second PB step of this example is equal
to

\[ -\frac{n}{2}\left(p_{S_1}\log_2 p_{S_1} + \cdots + p_{S_4}\log_2 p_{S_4}\right), \]

which is exactly the entropy that is associated with the
partition into $S_1, S_2, S_3, S_4$ times the number of 4-bit vectors
in $s$. This is a consequence of the fact that

\[ S(A,B) = \left(-p_A \log_2 p_A - p_B \log_2 p_B\right) + p_A S(A) + p_B S(B). \]

So it is of no importance how a certain situation is attained:
the number of AEM (= the cost in ebits) always
equals the total information gain. We can continue performing
PB steps in this way until all sets considered are
singletons. We then have determined $s$ completely, at the
cost of $nS(\rho)$ ebits.
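
The statement that the cost does not depend on how the partition is reached can be checked numerically for the example PB 1010 followed by PB 0101 and PB 1000. The sketch below compares the staged cost with the entropy of the final four-set partition, per 4-bit vector; the Werner-type weights used in the usage line are an assumed test input.

```python
import math
from itertools import product


def entropy(probs):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(q * math.log2(q) for q in probs if q > 0)


def parity(a, g):
    """Inner product a^T g modulo 2 for two equal-length bit strings."""
    return sum(int(x) * int(y) for x, y in zip(a, g)) % 2


def staged_and_direct_cost(p):
    """Return (cost of the two PB stages, entropy of the partition S1..S4)."""
    prob = {}
    for bits in product("01", repeat=4):
        g = "".join(bits)
        prob[g] = p[g[:2]] * p[g[2:]]

    def weight(condition):
        return sum(q for g, q in prob.items() if condition(g))

    pa0 = weight(lambda g: parity("1010", g) == 0)
    pa1 = 1 - pa0
    s1 = weight(lambda g: parity("1010", g) == 0 and parity("0101", g) == 0)
    s2 = pa0 - s1
    s3 = weight(lambda g: parity("1010", g) == 1 and parity("1000", g) == 0)
    s4 = pa1 - s3
    staged = (entropy([pa0, pa1])
              + pa0 * entropy([s1 / pa0, s2 / pa0])
              + pa1 * entropy([s3 / pa1, s4 / pa1]))
    direct = entropy([s1, s2, s3, s4])
    return staged, direct


weights = {"00": 0.8, "01": 0.2 / 3, "10": 0.2 / 3, "11": 0.2 / 3}
print(staged_and_direct_cost(weights))
```

Both numbers must be multiplied by $n/2$, the number of 4-bit vectors in $s$, to obtain the total number of AEM.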

Of course, there is no point in dividing the breeding
protocol into successive stages of partial breeding if nothing
else changes. In Ref. [13], $0^{(a)}$ pairs are further purified by breeding, but
the $1^{(a)}$ pairs are treated differently: on the first pair of
every $1^{(a)}$ state, a BPM 10 is performed, yielding the parity
10 of this pair. As the pair is measured, it is lost, but
the measurement also provides information on the second
pair. This one is in $\{10, 11\}$ if the outcome was $+1$ and in
$\{00, 01\}$ if the outcome was $-1$. So in both cases, we end
up with a rank two Bell-diagonal state, for which it has
been proved that the breeding protocol is optimal.
The yield of this protocol is calculated in Ref. [13], and
turns out to be greater than that of breeding. But the
reason why this necessarily must be so remains obscure.
We will shed light on this issue in the next section.

IV. ENTROPY REDUCTION 

The reason why the protocol of Ref. [13] outperforms
the breeding protocol lies in the difference between an
AEM and a BPM. If a parity check is performed on a
finite number $m/2$ of pairs, represented by an ensemble
of vectors $g \in \mathbb{Z}_2^m$, the resulting state will have lower entropy
after a BPM than after an AEM. Besides extracting
information in the form of the parity, a BPM results
in the mapping of different vectors to the same new vector,
resulting in an extra entropy reduction.

To see this, we recall the procedure to carry out a BPM
explained in Sec. II B. If $a^T g$ is the parity we would like
to know, we first perform local Cliffords represented by
a symplectic $C \in \mathbb{Z}_2^{m \times m}$ of which the last row is $a^T$,
followed by a bilateral $\sigma_z$ measurement on the last pair.
This results in a new state (with one pair less) represented
by $\tilde{C}g$, where $\tilde{C}$ is $C$ without its last two rows. By the
measurement, we learn $a^T g$ but we also lose $h^T g$, where $h^T$
is the second last row of $C$.
This loss causes all $g$ with the same result $\tilde{C}g$ and outcome
$a^T g$ to be mapped to the same vector $\tilde{C}g$. Note
that the outcomes should be equal as well, otherwise one
of the two is ruled out. From the symplecticity of $C$, it
follows that $g$ and $g + Pa$ are mapped together. Indeed,
$\tilde{C}Pa = 0$ and $a^T Pa = 0$. Consequently, the new state is
represented by the ensemble of vectors $\tilde{C}g$, with probabilities
$p_g + p_{g+Pa}$ (up to normalization). This addition of probabilities results
in the extra entropy reduction.

Let us illustrate this with an example. We have two
pairs represented by an ensemble of 4-bit vectors and we
perform a BPM $a = 1111$. We are left with only one pair
represented by an ensemble of 2-bit vectors. The probabilities
are

\[ \frac{p_{0000}+p_{1111}}{p_{0^{(a)}}},\ \frac{p_{0011}+p_{1100}}{p_{0^{(a)}}},\ \frac{p_{0101}+p_{1010}}{p_{0^{(a)}}},\ \frac{p_{0110}+p_{1001}}{p_{0^{(a)}}} \]

if the outcome is $+1$, and

\[ \frac{p_{0001}+p_{1110}}{p_{1^{(a)}}},\ \frac{p_{0010}+p_{1101}}{p_{1^{(a)}}},\ \frac{p_{0100}+p_{1011}}{p_{1^{(a)}}},\ \frac{p_{0111}+p_{1000}}{p_{1^{(a)}}} \]

if the outcome is $-1$. Note that we do not identify these
probabilities with the two-bit vectors $\tilde{C}g$: all future actions
are described entirely in terms of the original vectors
$g$, as explained in Sec. II B. If we had used an
AEM, then we would still have two pairs, but represented
only by 8 vectors instead of 16, with probabilities

\[ \frac{p_{0000}}{p_{0^{(a)}}},\ \frac{p_{1111}}{p_{0^{(a)}}},\ \frac{p_{0011}}{p_{0^{(a)}}},\ \frac{p_{1100}}{p_{0^{(a)}}},\ \frac{p_{0101}}{p_{0^{(a)}}},\ \frac{p_{1010}}{p_{0^{(a)}}},\ \frac{p_{0110}}{p_{0^{(a)}}},\ \frac{p_{1001}}{p_{0^{(a)}}} \]

if the outcome is $+1$, and

\[ \frac{p_{0001}}{p_{1^{(a)}}},\ \frac{p_{1110}}{p_{1^{(a)}}},\ \frac{p_{0010}}{p_{1^{(a)}}},\ \frac{p_{1101}}{p_{1^{(a)}}},\ \frac{p_{0100}}{p_{1^{(a)}}},\ \frac{p_{1011}}{p_{1^{(a)}}},\ \frac{p_{0111}}{p_{1^{(a)}}},\ \frac{p_{1000}}{p_{1^{(a)}}} \]

if the outcome is $-1$. The average difference in entropy
is equal to

\[
\begin{aligned}
&\left[-p_{0000}\log_2 p_{0000} - p_{1111}\log_2 p_{1111} - \cdots - p_{0111}\log_2 p_{0111} - p_{1000}\log_2 p_{1000}\right] \\
&\quad + \left[(p_{0000}+p_{1111})\log_2(p_{0000}+p_{1111}) + \cdots + (p_{0111}+p_{1000})\log_2(p_{0111}+p_{1000})\right]
\end{aligned}
\]

and is always positive. Indeed, for all $x, y > 0$, we have:

\[ \left[-x\log_2 x - y\log_2 y\right] + \left[(x+y)\log_2(x+y)\right] = (x+y)\,H\!\left(\tfrac{x}{x+y}, \tfrac{y}{x+y}\right), \tag{2} \]

where

\[ H(p, 1-p) = -p\log_2 p - (1-p)\log_2(1-p) \]

is the binary entropy function, plotted in Fig. 1.
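
The entropy bookkeeping of this example can be reproduced numerically. The sketch below computes the average entropy left after an AEM and after a BPM for the parity check 1111 on two pairs, and compares their difference with the sum of terms $(x+y)H$ over all colliding pairs $g$, $g + Pa$. Bit vectors are tuples, the function names are illustrative, and the Werner-type weights are an assumed test input.

```python
import math
from itertools import product


def entropy(probs):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(q * math.log2(q) for q in probs if q > 0)


def parity(a, g):
    """Inner product a^T g modulo 2."""
    return sum(x * y for x, y in zip(a, g)) % 2


def apply_p(a):
    """Multiply by P: swap the two bits inside every pair."""
    return tuple(a[i ^ 1] for i in range(len(a)))


def add(g, h):
    """Bitwise addition modulo 2."""
    return tuple((x + y) % 2 for x, y in zip(g, h))


def average_entropies(prob, a):
    """Average remaining entropy after an AEM and after a BPM for check a."""
    shift = apply_p(a)
    aem = bpm = 0.0
    for outcome in (0, 1):
        members = [g for g in prob if parity(a, g) == outcome]
        weight = sum(prob[g] for g in members)
        aem += weight * entropy([prob[g] / weight for g in members])
        merged = {}
        for g in members:
            key = min(g, add(g, shift))  # g and g + Pa collide
            merged[key] = merged.get(key, 0.0) + prob[g]
        bpm += weight * entropy([q / weight for q in merged.values()])
    return aem, bpm


def predicted_reduction(prob, a):
    """Sum of (x + y) H(x/(x+y), y/(x+y)) over colliding pairs, as in Eq. (2)."""
    shift = apply_p(a)
    total = 0.0
    for g in prob:
        h = add(g, shift)
        if g < h:  # count every colliding pair once
            x, y = prob[g], prob[h]
            total += (x + y) * entropy([x / (x + y), y / (x + y)])
    return total


single = {(0, 0): 0.8, (0, 1): 0.2 / 3, (1, 0): 0.2 / 3, (1, 1): 0.2 / 3}
two_pairs = {g: single[g[:2]] * single[g[2:]] for g in product((0, 1), repeat=4)}
aem, bpm = average_entropies(two_pairs, (1, 1, 1, 1))
print(aem - bpm, predicted_reduction(two_pairs, (1, 1, 1, 1)))
```

The two printed numbers should agree, since the normalization terms of both averages cancel in the difference.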

This plot shows that the entropy reduction, given by
the right hand side of Eq. (2), is larger the more the colliding
vectors $g$ and $g + Pa$ are equiprobable. If one probability
relative to the other becomes small, the entropy
reduction vanishes. That is the reason why the hashing
protocol, which is the same as breeding but with the parity
checks carried out as BPM instead of AEM, has the same yield as
the breeding protocol: again, we use the fact that almost
all weight comes from vectors $s \in T$. Since the $r$ are
completely random, so are $s + Pr$. Therefore, the probabilities
$\sim (p_{00}\,p_{01}\,p_{10}\,p_{11})^{n/4}$ of $s + Pr$ are infinitesimal
(as $n$ is large) compared to the probabilities $\approx 2^{-nS(\rho)}$ of
$s$ [13]. A variant of hashing, where some of the BPM
are on a finite number of copies resulting in a nonzero entropy
reduction, performs slightly better than hashing.

It is clear that we should focus on BPM on small numbers
of copies, because there lies the benefit of the entropy
reduction. However, up till now, we have only spoken
of the information gain, but we also have to take the
cost into account. PB requires AEM, each at the cost
of one ebit, whereas a BPM is at the cost of one of the
copies. But as in the end all non-measured copies will
be pure Bell states, this makes no difference. By
construction, every AEM in PB has equiprobable outcomes,
and therefore yields one bit of information. The
same holds for a BPM if $r$ has infinite length and is
random. Indeed, hashing is equivalent to breeding. But
if we are to perform small non-random parity checks, the
outcomes are not necessarily equiprobable and therefore
yield less than one bit of information. If the outcomes
are equiprobable, improvement is guaranteed. Note that
the BPM 10 on the first pair of two $1^{(a)}$ pairs does have
equiprobable outcomes, which explains the improvement
of Ref. [13] over breeding. So in some way, we should
try to spot as many finite equiprobable parity checks as
possible and carry them out by BPM.



V. PROTOCOL 

In the following, we will denote the all-zeros $m$-bit vector
by $0_m$ and the all-ones vector by $1_m$. For any binary
vector $g \in \mathbb{Z}_2^m$, we will denote $g + 1_m$ by $\bar{g}$. Whenever the
parity check $1_m$ has been performed on $m/2$ qubit pairs
with outcome $\alpha \in \mathbb{Z}_2$ (we will call $\alpha$ the outcome instead
of $(-1)^\alpha$), we will denote the resulting state by $[\alpha^{(m)}]$
if the parity check was a BPM and by $\alpha^{(m)}$ otherwise.
Recall from Sec. IV that the probabilities of $[\alpha^{(m)}]$ are
(up to normalization) $p_g + p_{\bar{g}}$, whereas the probabilities
of $\alpha^{(m)}$ are $p_g$, where in both cases all $g$ satisfy $1_m^T g = \alpha$.



A. Decoupling 

Learning the parity of a number of qubit pairs by partial
breeding or BPM causes statistical dependence of
the pairs involved, which makes the continuation of the
protocol very complicated. However, this statistical dependence
can be undone, which we refer to as decoupling.
The idea of decoupling is best explained by an example.
Suppose that by PB 1111 we learn for every two copies of a
Bell-diagonal qubit pair their state $\alpha^{(4)}$. Where the states
of the copies were independent before, this obviously no
longer holds afterwards. But if next we perform PB 11 on
all first pairs, yielding for a particular first pair its state
$\beta^{(2)}$ where the state of both pairs was $\alpha^{(4)}$, we now have
two independent pairs $\beta^{(2)}$ and $(\alpha + \beta)^{(2)}$. Indeed, we
have learned the parities $1111 \to \alpha$ and $1100 \to \beta$, which
is equivalent to knowing $1100 \to \beta$ and $0011 \to \alpha + \beta$,
or the parity 11 for both pairs. So where the first PB coupled the
ensembles of the two pairs, the second decoupled them
again.

The same holds for PB $1111 \to \alpha$ followed by
BPM $11 \to \beta$ on the first pair. This is equivalent to
BPM $11 \to \beta$ on the first pair and PB $11 \to \alpha + \beta$ on the
second pair. And it can be verified that BPM $1111 \to \alpha$
followed by BPM $11 \to \beta$ on the first pair is equivalent to
BPM $11 \to \beta$ on the first pair and BPM $11 \to \alpha + \beta$ on
the second pair. This idea was also used in the adaptive
stabilizer code formalism.

However, this decoupling rule does not hold for BPM
followed by PB. Once we have carried out a BPM on a
number of qubit pairs, we have statistical dependence not
only by the knowledge of the overall parity, but also by
the mapping together of vectors as explained in Sec. IV.
It is this dependence that we denote by square brackets.
Although the knowledge on the parities decouples
by PB, this mapping does not. As an example, let BPM
1111 followed by PB 1100 have outcomes 0 and 1 on two
particular pairs respectively. The resulting state of the
pairs is $[1^{(2)}1^{(2)}]$ and has probabilities

\[ \frac{p_{0101}+p_{1010}}{p_{1^{(2)}}^2} \quad \text{and} \quad \frac{p_{0110}+p_{1001}}{p_{1^{(2)}}^2}. \]

Therefore, once a BPM is carried out on a number of
qubit pairs, we have to take it into account until it is
later decoupled by a BPM on some of the qubit pairs.

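We summarize all scenarios (the parity check on $\alpha^{(2m)}$
is always $1_m 0_m$, and $\beta$ denotes its outcome):

\[
\begin{array}{llll}
 & \text{state} & \text{outcome} & \text{resulting state} \\
\text{PB} & \alpha^{(2m)} & \beta & \beta^{(m)}\,(\alpha+\beta)^{(m)} \\
 & [\alpha^{(2m)}] & \beta & [\beta^{(m)}\,(\alpha+\beta)^{(m)}] \\
\text{BPM} & \alpha^{(2m)} & \beta & [\beta^{(m)}]\,(\alpha+\beta)^{(m)} \\
 & [\alpha^{(2m)}] & \beta & [\beta^{(m)}]\,[(\alpha+\beta)^{(m)}]
\end{array}
\tag{3}
\]

If the considered state was connected to others by previous
BPM, like in $[x\ \alpha^{(2m)}\ y]$, the state transforms as
follows:

\[
\begin{array}{llll}
 & \text{state} & \text{outcome} & \text{resulting state} \\
\text{PB} & [x\ \alpha^{(2m)}\ y] & \beta & [x\ \beta^{(m)}\,(\alpha+\beta)^{(m)}\ y] \\
\text{BPM} & [x\ \alpha^{(2m)}\ y] & \beta & [\beta^{(m)}]\,[x\ (\alpha+\beta)^{(m)}\ y]
\end{array}
\tag{4}
\]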

Note that decoupling is nothing more than linearity of 
parity checks. Whenever we have performed a number 
of parity checks, these generate a space of parity checks. 
Any generating set of this space is equivalent to the orig- 
inal set of parity checks. E.g. {0101, 1010} is equivalent 
to {1010, 1111}. We will use decoupling parity checks 
because they result in a transparent distillation protocol. 
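
The equivalence of two sets of parity checks amounts to equality of their spans over $\mathbb{Z}_2$. A minimal sketch of this test, with bit strings encoded as integers and an illustrative function name:

```python
def span_gf2(vectors):
    """Set of all sums modulo 2 (XOR combinations) of the given bit strings."""
    span = {0}
    for v in vectors:
        x = int(v, 2)
        span |= {s ^ x for s in span}
    return span


same = span_gf2(["0101", "1010"]) == span_gf2(["1010", "1111"])
different = span_gf2(["0101", "1010"]) == span_gf2(["1010", "1100"])
print(same, different)
```

The first comparison covers the example from the text; the second uses a set that generates a different space, so the outcomes of one set cannot be inferred from the other.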



B. Parity checks with equiprobable outcomes 



In Sec. II B we showed that, once we have performed
a BPM, we have to make sure that all following parity
checks commute with it. There is a way in which this is
automatically achieved. All vectors of the form $x \otimes 11$
commute (we could also have taken 01 or 10). Indeed,
for all binary vectors $x, y$ of equal length, it holds that

\[ \left(x \otimes \begin{bmatrix} 1 \\ 1 \end{bmatrix}\right)^T P \left(y \otimes \begin{bmatrix} 1 \\ 1 \end{bmatrix}\right) = (x^T y)\begin{bmatrix} 1 & 1 \end{bmatrix}\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}\begin{bmatrix} 1 \\ 1 \end{bmatrix} = 0. \]

Therefore, if we stick to parity checks of this form, we
do not have to care about commutativity any more. In
this way, for every qubit pair we can find out whether it
is $0^{(2)}$ or $1^{(2)}$. For now, let us assume we go up to this
point but not further: we want to find an optimal way of
reaching the point where every pair is determined as $0^{(2)}$
or $1^{(2)}$.

Whenever we spot parity checks with equiprobable
outcomes, we should perform them by BPM. We will now explain
how to do this. Suppose we have $m$ qubit pairs, determined
as $1^{(2m)}$ by a previous parity check $1_{2m}$. Then
the parity check $1_m 0_m$ has equiprobable outcomes. Indeed,
it holds that

\[ 1^{(2m)} = \left\{ \begin{array}{ll} 0^{(m)}1^{(m)} & \text{or} \\ 1^{(m)}0^{(m)}. \end{array} \right. \]

Clearly, both possibilities have the same initial probability
$p_{0^{(m)}}\,p_{1^{(m)}}$, or 1/2 after normalization. Therefore,
performing the parity check $1_m$ on the left half yields the
parities of both halves and this information equals one bit.
By performing a BPM, we have the extra entropy reduction.
Furthermore, this BPM decouples the two halves
of the state.

However, if the m pairs are 



Q(2m) 



Q(m)Q(m) 



we have the following recursion relation: 
t ^ t{pl+pl) 



Px 

Py 
k 



Px 



pI+pI 



pI 



Px+Py 
2k. 



we do not have equiprobable possibilities. With a little 
trick, we still are able to force an equiprobable outcome 
parity check. Two states of this kind can be written as 



Q(2m)Q(2m) 



Q(m)Q(m) or 

o(™)o('") or 

Q(m)Q(m) Q(m)Q(m) qj. 



With an extra PB 0ml2m0m, we can distinguish the first 
two possibilities from the last two (as indicated by the 
line). If the outcome is 1, again we have two equiproba- 
ble possibilities 0(™)0(™)l(™)l(™) and l(™)l('")o(™)o("), 
that are separated by a BPM 1^ on one of the four m- 
bit vectors. If the outcome is 0, the possibilities are not 
equiprobable, but again we can bring two of these results 
together, with possibilities 



At each step, we have a probability 2pxPy that one of the 
m-bit vectors involved is detemined by BPM. So each 
step yields another fraction 2tpxPy/k of O'^'") on half of 
which a BPM is performed. It can be verified that the 
total sum of these fractions over all steps is equal to 



77(0(2™)) = . (""'^" 

1=0 2» n (v^^+w^^) 

j = 



(5) 



where 



V 



and w — 



In practice, it suffices to truncate the procedure after a 
few steps, since the terms in the summation of Eq. (0) 
decrease exponentially fast. 



C. Numerical calculation of the yield 



Q(m)Q(m)Q(m)Q(m) ^(m) ^(rn) ^(m) -^{m) 

l("»)l("»)l(™)l("i) o('")o(™)o(™)0('") or 

Q(m)Q(m)g(m)g(m) Q{m) Q{m) Q{m) Q{m) 
^{m)^{m)^{m)^{m) ^(m) ^(m) -^(m) ^{m) 

and performing PB 03ml2m03m separating the possibili- 
ties as indicated by the line, and so forth. Clearly, this 
trick can be repeated endlessly. 

We calculate the average fraction 77(0(2™)) of 0(2™) 
on half of which a BPM is performed (note that 
77(1(2™)) = 1). The procedure explained in the previ- 
ous paragraph is recursive: at each step, we combine 
two random variables with two possible values x and y 
{Px + Py = 1 ) • The variables of the next step are xx 
and yy, and so on. Therefore, it is possible to calculate 
77(0(2™)) in a recursive way. Let t be the probability to 
reach the situation under consideration and k the total 
number of 0(2™) involved in the present step. Initially, 
we have 



The protocol starts with PB l2g+i . The next step is 
an iteration of the procedure explained in Sec. IV Bl for 
TO = 2"^, 2"^"^, ... 2, where we use the update rules Q and 
For now, we will treat all 0(2™) in the same way, i.e. 
we do not favour particular states being parity checked 
by BPM. As a consequence, every 0(2™) has the same 
probability 77(9(2™)) of undergoing a BPM l^O^. We 
find that, from one step to the next, the states transform 
as follows: 



state         transforms to          with probability

$0^{(2m)}$    $[0^{(m)}]0^{(m)}$     $\eta(0^{(2m)})/2$
              $[1^{(m)}]1^{(m)}$     $\eta(0^{(2m)})/2$
              $0^{(m)}0^{(m)}$       …
              $1^{(m)}1^{(m)}$       …

$1^{(2m)}$    $[0^{(m)}]1^{(m)}$     $1/2$
              $[1^{(m)}]0^{(m)}$     $1/2$

(6)



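$$t = 1, \qquad p_x = \frac{p_{0^{(m)}}^2}{p_{0^{(2m)}}}, \qquad p_y = \frac{p_{1^{(m)}}^2}{p_{0^{(2m)}}}, \qquad k = 2.$$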

From the procedure explained in the previous paragraph. 



With these rules, we are able to calculate the frequencies
(i.e. the expected number of occurrences per $2^q$ qubit
pairs) of all possibilities from one step to the next. After
the last step, we are left only with $0^{(2)}$ and $1^{(2)}$ pairs, in
various combinations of BPM (denoted by square brackets).
Within square brackets, permutations of pairs yield
equivalent states. Therefore, we do not have to calculate
the frequencies of all possibilities, but only up to a
permutation of the pairs: between square brackets, only the
numbers $n_0$ of $0^{(2m)}$ and $n_1$ of $1^{(2m)}$ matter. We denote
this by $[n_0, n_1]$. The possibilities in the end are then:

$0^{(2)}$, $1^{(2)}$,
$[1,0]$, $[0,1]$,
$[2,0]$, $[1,1]$, $[0,2]$,
$\vdots$
$[2^q,0]$, $[2^q-1,1]$, $\ldots$, $[0,2^q]$,   (7)

with frequencies $f(0^{(2)}), f(1^{(2)}), f([1,0]), \ldots, f([0,2^q])$.
Note that these must satisfy

$$\sum_A \left(n_0(A) + n_1(A)\right) f(A) = 2^q,$$

where we define $n_0(0^{(2)}) = 1$, $n_1(0^{(2)}) = 0$ and $n_0(1^{(2)}) = 0$,
$n_1(1^{(2)}) = 1$. By partial breeding alone, $nS^{(2)}(p)$ ebits
would have been sacrificed. Now, for every BPM, we have
one ebit less that has been measured. Therefore, the total
cost of ebits per qubit pair up to this point equals

$$S^{(2)}(p) - \frac{1}{2^q}\sum_{[n_0,n_1]} f([n_0,n_1]). \qquad (8)$$

But the protocol is not finished yet. Breeding is optimal
for the pairs that have never been involved in some
BPM, as they are independent rank two Bell diagonal
states. We show that breeding is optimal for all
pairs. Although equiprobable parity checks can still be
found, they will no longer result in an entropy reduction
if carried out by a BPM. Indeed, all further parity
checks $a$ must be entirely built of 01 and 10, because
for every pair we already know the parity 11. Therefore,
$Pa$ too is built of 01 and 10. Since every pair is either
$0^{(2)} = \{00, 11\}$ or $1^{(2)} = \{01, 10\}$, the mapping of vectors
vanishes: one of the two vectors mapped to the same new
vector has already been ruled out by the parity checks,
because $0^{(2)} + 01 = 0^{(2)} + 10 = 1^{(2)}$. Deprived of the
benefit of entropy reduction by BPM, the best thing left
is to gain one bit of information for every measurement.
The number of ebits needed per qubit pair equals the
entropy per pair




$$\frac{1}{2^q}\sum_A f(A)S(A) \qquad (9)$$



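left in the overall state. It can be verified that

$$S(0^{(2)}) = H(q_{00}, q_{11}),$$
$$S(1^{(2)}) = H(q_{01}, q_{10}), \qquad (10)$$
$$S([n_0,n_1]) = -\sum_{i}\sum_{j} \ldots\, P(i,j)\log_2 P(i,j),$$

where $q_{00} = \frac{p_{00}}{p_{00}+p_{11}}$, $q_{11} = \frac{p_{11}}{p_{00}+p_{11}}$,
$q_{01} = \frac{p_{01}}{p_{01}+p_{10}}$, $q_{10} = \frac{p_{10}}{p_{01}+p_{10}}$.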



Now all non-measured qubit pairs are pure ebits. The
fraction of non-measured pairs equals

$$1 - \frac{1}{2^q}\sum_{[n_0,n_1]} f([n_0,n_1]). \qquad (11)$$

If we subtract the total number of measured ebits, which
is the sum of (8) and (9), from this value (11), we get the
yield of the protocol:

$$1 - S^{(2)}(p) - \frac{1}{2^q}\sum_A f(A)S(A). \qquad (12)$$
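The bookkeeping behind the yield can be sketched as follows. This is a minimal sketch, not the authors' implementation: it takes the frequencies $f(A)$ (per $2^q$ qubit pairs) and the entropies $S(A)$ as inputs supplied by the caller and evaluates one minus $S^{(2)}(p)$ minus the entropy left per pair. The names `protocol_yield` and `yield_without_bpm` are hypothetical. The special case shown uses $q = 0$ with no BPM at all, and it assumes that $S^{(2)}(p)$ is the entropy of the parity 11 of a single pair; under that assumption the chain rule for entropy reduces the expression to the breeding yield $1 - H(p)$.

```python
from math import log2


def entropy(probs):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * log2(p) for p in probs if p > 0)


def protocol_yield(s2, freq, ent, q):
    """Evaluate 1 - S2 - 2**(-q) * sum_A f(A) * S(A).

    freq[A] is the frequency of possibility A per 2**q qubit pairs and
    ent[A] the entropy S(A) left in A; both are supplied by the caller.
    """
    return 1.0 - s2 - sum(freq[a] * ent[a] for a in freq) / 2 ** q


def yield_without_bpm(p00, p01, p10, p11):
    """Illustrative special case (q = 0, no BPM): only 0^(2) and 1^(2) occur."""
    p0, p1 = p00 + p11, p01 + p10
    s2 = entropy([p0, p1])  # ASSUMPTION: S^(2)(p) is the entropy of the parity 11
    freq = {"0": p0, "1": p1}
    ent = {"0": entropy([p00 / p0, p11 / p0]),
           "1": entropy([p01 / p1, p10 / p1])}
    return protocol_yield(s2, freq, ent, q=0)
```

In the actual protocol the dictionary of frequencies would also contain the possibilities $[n_0, n_1]$, each with its own entropy.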



D. Favouring BPM on a small number of pairs 

It can be verified that the entropy reduction is larger
for a BPM on a small number of pairs than on a large
number of pairs. In the first version of our protocol, we
did not make use of this, since all $0^{(2m)}$ were treated
equally. So there is still room for improvement. As an
example, consider the following situation:

where all "*" are either $0^{(m)}$ or $1^{(m)}$, and a parity check
$1_m$ on one of them (with equiprobable outcomes) determines
them all. Then it is better to do a BPM on one of
the first three, resulting in

than on one of the last five, resulting in 
* * 1^*] * ^>(=]. 

Indeed, it can be verified that $S([*\,*\,*]) - S([*\,*])$ is larger
than $S([*\,*\,*\,*\,*]) - S([*\,*\,*\,*])$.

We show how to increase the number of BPM on small
numbers of pairs. At each step, we have $0^{(2m)}$ and $1^{(2m)}$,
distributed over all possibilities. We carry out BPM
$1_m0_m$ on all $1^{(2m)}$, so there the situation remains the
same. But the same cannot be done for all $0^{(2m)}$: there
the ones that are linked by BPM (i.e. in square brackets)
to a small number of pairs should be taken first. Every
$0^{(2m)}$ is part of some state $A$, where $n_0$ is nonzero. We
now order all possibilities $[n_0, n_1]$ according to increasing
$n_0 + n_1$ and on a second level according to increasing $n_0$.
So for example $[5,3] < [6,2] < [4,5]$. We favour small
$n_0$ on a second level because all $1^{(2m)}$ will be certainly
reduced, on average resulting in smaller $n_0$ and $n_1$ in the
end. We also define that all $[n_0,n_1] < 0^{(2m)}$. Probably
better orderings can be found, but we do not want to
complicate things further. We define

$$L(A) = \frac{\sum_{B<A} n_0(B)f(B)}{p_{0^{(2m)}}\,2^q/m}$$

and $U(A)$ with the same formula but "$<$" replaced by
"$\leq$". $L(A)$ and $U(A)$ are the fractions of all $0^{(2m)}$ that
are part of some $B < A$ and $B \leq A$ respectively. Note that
$L([1,0]) = 0$ and $U(0^{(2m)}) = 1$. We combine the $0^{(2m)}$
for the procedure explained in Sec. IV B as follows: first
we divide all $0^{(2m)}$ in two equally large sets (i.e. both sets
contain $p_{0^{(2m)}}\,n/m$ elements): every element of the first set
is part of some $A \leq$ that of every element of the second
set. Now every $0^{(2m)}$ of the first set is combined with
one of the second set and PB $0_m1_{2m}0_m$ is performed.
Whenever the outcome is 1 (the probability of which is
calculated in the same way as in Sec. IV B), a BPM $1_m0_m$
is performed on the first $0^{(2m)}$. All $0^{(2m)}0^{(2m)}$ with
outcome 0 are again divided in two halves, according to the
ordering of every first $0^{(2m)}$. By continuing in this way,
the fraction $\eta(0^{(2m)}|A)$ of $0^{(2m)}$, part of some $A$, on which
a BPM $1_m0_m$ is performed, can be calculated, and equals



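$$\eta(0^{(2m)}|A) = \sum_{i}^{u(A)-1} \ldots \; + \sum_{i=u(A)}^{l(A)} \frac{\ldots}{U(A)-L(A)} \ldots, \qquad (13)$$

where $l(A) = \lceil -\log_2 L(A) \rceil$ and $u(A) = \lceil -\log_2 U(A) \rceil$, the
summands involve the expression

$$\frac{2(vw)^{2^i}}{\prod_{j=0}^{i}\left(v^{2^j}+w^{2^j}\right)},$$

and $v$ and $w$ are as in Eq. (5).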

As in Eq. (5), the terms in the second summation in
Eq. (13) decrease exponentially fast. Therefore, when
$l(A)$ is large, the procedure may be truncated after a
number of steps. In the update rules (6), $\eta(0^{(2m)})$ must
be replaced by $\eta(0^{(2m)}|A)$. Note that we have different
update rules for different possibilities $A$. With this,
we end up with the same possibilities (7) but with different
frequencies $f(0^{(2)}), f(1^{(2)}), f([1,0]), \ldots, f([0,2^q])$.
To calculate the yield, we still use Eqs. (10) and (12).
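The ordering of the possibilities and the cumulative fractions $L(A)$ and $U(A)$ can be sketched as follows. This is an illustrative sketch under stated assumptions: a possibility $[n_0, n_1]$ is represented by the tuple `(n0, n1)`, the unlinked state $0^{(2m)}$ by the string `"0"`, and the fractions are normalised by the total weight $\sum_B n_0(B)f(B)$, which is assumed to equal the total number of $0^{(2m)}$. The names `order_key` and `cumulative_fractions` are hypothetical.

```python
def order_key(state):
    """Sort key: increasing n0 + n1, then increasing n0; "0" comes last."""
    if state == "0":
        return (1, 0, 0)
    n0, n1 = state
    return (0, n0 + n1, n0)


def cumulative_fractions(freq):
    """Return {A: (L(A), U(A))} for every possibility A containing a 0^(2m).

    freq maps a possibility to its frequency f(A). ASSUMPTION: the total
    number of 0^(2m) equals the sum of n0(B) * f(B) over all B.
    """
    def n0(state):
        return 1 if state == "0" else state[0]

    states = sorted((s for s in freq if n0(s) > 0), key=order_key)
    total = sum(n0(s) * freq[s] for s in states)
    fractions, running = {}, 0.0
    for s in states:
        lower = running / total          # fraction in some B < A
        running += n0(s) * freq[s]
        fractions[s] = (lower, running / total)  # fraction in some B <= A
    return fractions


example = sorted([(6, 2), (4, 5), (5, 3)], key=order_key)
```

With this key, `example` reproduces the ordering $[5,3] < [6,2] < [4,5]$ given in the text, the first possibility in the ordering has $L = 0$, and $0^{(2m)}$ has $U = 1$.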



Also notice in Fig. 3 that the yields of the protocol of
Sec. IV D are larger than the yields for corresponding $q$ of
that of Sec. IV C.




FIG. 2: the yields of the protocol of Sec. IV C (solid lines), for
$q = 1, 2, 3, 4, 5, 6$, compared to the yield of breeding (dotted
line). The yield increases with increasing $q$ and converges for
large $q$.




VI. ILLUSTRATION WITH WERNER STATES 

We have numerically calculated the yield of the protocols
explained in Sec. IV for Werner states. Werner
states are Bell-diagonal states where $p_{00} = F$ and
$p_{01} = p_{10} = p_{11} = (1-F)/3$. $F$ is also called the fidelity of the
state. Werner states are typically the result of one party
preparing Bell states $|B_{00}\rangle$ and sending one qubit of the
pair to the other party via the depolarization channel

$$\rho \mapsto F\rho + \frac{1-F}{3}\left(\sigma_x\rho\sigma_x^\dagger + \sigma_y\rho\sigma_y^\dagger + \sigma_z\rho\sigma_z^\dagger\right).$$

In Figs. 2 and 3 we have plotted the yields of the protocols
of Sec. IV C and IV D for $q = 1, 2, 3, 4, 5, 6$. We
truncate the procedure of Sec. IV B after 10 steps. We see
that with increasing $q$, the yields of the protocols increase
but converge. This is due to the fact that the entropy
reduction is smaller for BPM on larger numbers of pairs.
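As a point of reference for the plotted curves, the breeding baseline for Werner states can be reproduced with a few lines of Python. This is a minimal sketch, assuming the standard breeding yield $1 - H(p)$ for a Bell-diagonal state with probabilities $p$; the function names are hypothetical and the bisection is only one possible way to locate the fidelity at which this expression changes sign.

```python
from math import log2


def werner_probabilities(F):
    """Bell-diagonal probabilities (p00, p01, p10, p11) of a Werner state."""
    return [F, (1 - F) / 3, (1 - F) / 3, (1 - F) / 3]


def entropy(probs):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * log2(p) for p in probs if p > 0)


def breeding_rate(F):
    """1 - H(p); the breeding yield is this value whenever it is positive."""
    return 1.0 - entropy(werner_probabilities(F))


def zero_crossing(lo=0.5, hi=1.0, tol=1e-10):
    """Bisection for the fidelity at which breeding_rate changes sign."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if breeding_rate(mid) > 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


threshold = zero_crossing()
```

The value of `threshold` is approximately 0.8107, the fidelity below which breeding alone gives no yield.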



FIG. 3: the yields of the protocol of Sec. IV D, where BPM
on small numbers of pairs are favoured (solid lines), compared
to the yields of that of Sec. IV C (dotted lines), for
$q = 1, 2, 3, 4, 5, 6$. Again, the yield increases with increasing $q$
and converges for large $q$.

We see that the yield of our best protocol is zero when
$F < 0.7424$. This threshold is lower than that of breeding (0.8107), but
in order to distill states with lower fidelity, we first have
to apply a number of iterations of recurrence. Before
every recurrence iteration, one-qubit local Clifford operations,
yielding a permutation of the Bell states, are
applied to each pair such that $p_{00} \geq p_{01}, p_{10} \geq p_{11}$ for
the transformed pairs. Recurrence itself consists
of a BPM 1111 on every two pairs, after which all remaining
pairs where this parity check yielded 1 are discarded.
The remaining pairs, where the outcome was 0,
have higher fidelity and are kept for a next iteration or
for an asymptotic protocol. Note that the discarding
can be interpreted as an extra BPM 1100, which has
equiprobable outcomes. Therefore, the recurrence iterations
before our protocol only improve it by the fact that
also non-equiprobable parity checks are carried out by
BPM. For low-fidelity states, the fact that the information
gain is not maximal is more than compensated by the entropy
reduction. A next generation of protocols should incorporate
a more complex criterion for BPM than merely
equiprobable parity check outcomes, but we will not go
deeper into that issue. We have compared the yield of
breeding preceded by recurrence iterations to that of our
protocol preceded by recurrence iterations in Fig. 4. The
discontinuities in the slope are due to the fact that the
optimal number of recurrence iterations is dependent on
the fidelity. We have also plotted the relative difference
in Fig. 5, which is the difference of the yields divided by
the yield of breeding preceded by recurrence iterations.
The sawtooth-like shape is caused by the fact that the
discontinuities in the slopes of the yields do not coincide
for the two protocols.

FIG. 4: the yield of our best protocol (solid line) and breeding
(dotted line), both preceded by an optimal number of
recurrence iterations, as a function of the fidelity $F$.

FIG. 5: the relative difference of the yields, as a function of the fidelity $F$.

VII. CONCLUSION 

We have presented a new asymptotic distillation protocol
that, based on the important principle of entropy
reduction, outperforms all previous asymptotic protocols.
In doing so, we have shed light on issues that were not clear
before, such as the reason for the benefit of recurrence.
Although we cannot claim to approach the entanglement of
distillation, we certainly have tightened its lower bound.
We have also mentioned roads that are still open for
investigation. However, we feel that searching for further
improvement will result in highly complicated protocols,
possibly the product of an exhaustive search in a
super-exponential decision space.